Abstract: Energy consumption imposes a significant cost for data centers; yet much of that energy is used to maintain excess service
capacity during periods of predictably low load. As a result, there has recently been interest in developing designs that allow the
service capacity to be dynamically resized to match the current workload. However, there is still much debate about the value of such 
approaches in real settings. In this paper, we show that the value of dynamic resizing is highly dependent on statistics of the workload 
process. In particular, both slow time-scale non-stationarities of the workload (e.g., the peak-to-mean ratio) and the fast time-scale 
stochasticity (e.g., the burstiness of arrivals) play key roles. To illustrate the impact of these factors, we combine optimization-based 
modeling of the slow time-scale with stochastic modeling of the fast time scale. Within this framework, we provide both analytic and 
numerical results characterizing when dynamic resizing does (and does not) provide benefits. 
1 Introduction

Energy costs represent a significant, and growing, frac- 
tion of a data center's budget. Hence there is a push to 
improve the energy efficiency of data centers, both in 
terms of the components (servers, disks, network |30|, 
0, E, HI) and the algorithms U, El, ED, ESJ. One 
specific aspect of data center design that is the focus of 
this paper is dynamically resizing the service capacity of 
the data center so that during periods of low load some 
servers are allowed to enter a power-saving mode (e.g., 
go to sleep or shut down). 

The potential benefits of dynamic resizing have been 
a point of debate in the community IIT91 , 1111 , [31J. On 
one hand, it is clear that, because data centers are far 
from perfectly energy proportional, significant energy 
is used to maintain excess capacity during periods of 
predictably low load when there is a diurnal workload 
with a high peak-to-mean ratio. On the other hand, there 
are also significant costs to dynamically adjusting the 
number of active servers. These costs come in terms of 
the engineering challenges in making this possible [13],
[35], [5], as well as the latency, energy, and wear-and-tear
costs of the actual "switching" operations involved [7],

The challenges for dynamic resizing highlighted above
have been the subject of significant research. At this 
point, many of the engineering challenges associated 
with facilitating dynamic resizing have been resolved, 
e.g., [13], [35], [5]. Additionally, the algorithmic challenge
of deciding, without knowledge of the future workload, 
whether to incur the significant "switching costs" as- 
sociated with changing the available service capacity 
has been studied in depth and a number of promising 
algorithms have emerged [26], [3], [11], [16].

However, despite this body of work, the question of 
characterizing the potential benefits of dynamic resizing 
has still not been properly addressed. Providing new 
insight into this topic is the goal of the current paper. 

The perspective of this paper is that, apart from en- 
gineering challenges, the key determinant of whether 
dynamic resizing is valuable is the workload, and that 
proponents on different sides tend to have different as- 
sumptions in this regard. In particular, a key observation, 
which is the starting point for our work, is that there 
are two factors of the workload which provide dynamic 
resizing potential savings: 

(i) Non-stationarities at a slow time-scale, e.g., diurnal 
workload variations. 

(ii) Stochastic variability at a fast time-scale, e.g., the 
burstiness of request arrivals. 

The goal of this work is to investigate the impact of and 
interaction between these two features with respect to 
dynamic resizing. 

To this point, we are not aware of any work character- 
izing the benefits of dynamic resizing that captures both 
of these features. There is one body of literature which 
provides algorithms that take advantage of (i), e.g., [11],
[10], [26], [3]. This work tends to use an optimization-based
approach to develop dynamic resizing algorithms.
There is another body of literature which provides algorithms
that take advantage of (ii), e.g., [16], [17]. This
work tends to assume a stationary queueing model with 
Poisson arrivals to develop dynamic resizing algorithms. 

The first contribution of the current paper is to provide 
an analytic framework that captures both effects (i) 
and (ii). We accomplish this by using an optimization 
framework at the slow time-scale (see Section 2), which
is similar to that of [26], and combining this with stochastic
network calculus and large deviations modeling for
the fast time-scale (see Section 3), which allows us to
study a wide variety of underlying arrival processes. 
We consider both light-tailed models with various de- 
grees of burstiness and heavy-tailed models that exhibit 
self-similarity. The interface between the fast and slow 
time-scale models happens through a constraint in the 
optimization problem that captures the Service Level 
Agreement (SLA) for the data center, which is used by 
the slow time-scale model but calculated using the fast 
time-scale model (see Section 3).

Using this modeling framework, we are able to pro- 
vide both analytic and numerical results that yield new 
insight into the potential benefits of dynamic resizing 
(see Section 4). Specifically, we use trace-driven numerical
simulations to study (i) the role of burstiness
for dynamic resizing, (ii) the role of the peak-to-mean 
ratio for dynamic resizing, (iii) the role of the SLA for 
dynamic resizing, and (iv) the interaction between (i), 
(ii), and (iii). The key realization is that each of these 
parameters is extremely important for determining the
value of dynamic resizing. In particular, for any fixed 
choices of two of these parameters, the third can be 
chosen so that dynamic resizing does or does not pro- 
vide significant cost savings for the data center. Thus, 
performing a detailed study of the interaction of these 
factors is important. To that end, Figures [I2p4 provide 
concrete illustrations of which settings of peak-to-mean 
ratio, burstiness, and SLAs dynamic resizing is and is 
not valuable. Hence, debate about the potential value of 
dynamic resizing can be transformed into debate about 
characteristics of the workload and the SLA. 

There are some interesting facts about these param- 
eters individually that our case studies uncover. Two 
important examples are the following. First, while one 
might expect that increased burstiness provides in- 
creased opportunities for dynamic resizing, it turns out 
the burstiness at the fast time-scale actually reduces the 
potential cost savings achievable via dynamic resizing. 
The reason is that dynamic resizing necessarily happens 
at the slow time-scale, and so the increased burstiness at 
the fast time-scale actually results in the SLA constraint 
requiring more servers be used at the slow time-scale due 
to the possibility of a large burst occurring. Second, it 
turns out the impact of the SLA can be quite different 
depending on whether the arrival process is heavy- or 
light-tailed. In particular, as the SLA becomes more strict,
the cost savings possible via dynamic resizing under
heavy-tailed arrivals decrease quickly; however, the
cost savings possible via dynamic resizing under light-tailed
workloads are unchanged.

In addition to detailed case studies, we provide ana- 
lytic results that support many of the insights provided 
by the numerics. In particular, Theorems 1 and 2 provide
monotonicity and scaling results for dynamic resizing in
the case of Poisson arrivals and heavy-tailed, self-similar
arrivals.

The remainder of the paper is organized as follows. 
The model is introduced in Sections 2 and 3, where
Section 2 introduces the optimization model of the slow
time-scale and Section 3 introduces the model of the fast
time-scale and analyzes the impact different arrival models
have on the SLA constraint of the dynamic resizing
algorithm used in the slow time-scale. Then, Section 4
provides case studies and analytic results characterizing
the impact of the workload on the benefits of dynamic
resizing. The related proofs are presented in Section 5.
Finally, Section 6 provides concluding remarks.

2 Slow Time-scale Model 

In this section and the one that follows, we introduce 
our model. We start with the "slow time-scale model". 
This model is meant to capture what is happening at 
the time-scale of the data center control decisions, i.e., at 
the time-scale which the data center is willing to adjust 
its service capacity. For many reasons, this is a much 
slower time-scale than the time-scale at which requests 
arrive to the data center. We provide a model for this 
"fast time-scale" in the next section. 

The slow time-scale model closely parallels the model
studied in [26]. The only significant change is to add
a constraint capturing the SLA to the cost optimization 
solved by the data center. This is a key change, which 
allows an interface to the fast time-scale model. 



2.1 The Workload 

At this time-scale, our goal is to provide a model which 
can capture the impact of diurnal non-stationarities in 
the workload. To this end, we consider a discrete-time 
model such that there is a time interval of interest
which is evenly divided into "frames" $k \in \{1, \ldots, K\}$. In
practice, the length of a frame could be on the order
of 5-10 minutes, whereas the time interval of interest
could be as long as a month or a year. The mean arrival
rate to the data center in frame $k$ is denoted by $\lambda_k$,
and non-stationarities are captured by allowing different
rates during different frames. Although we could allow
$\lambda_k$ to have a vector value to represent more than one
type of workload, as long as the resulting cost function
is convex in our model, we assume $\lambda_k$ to have a scalar
value in this paper to simplify the presentation. Because
the request inter-arrival times, typically on the order of
1-10 seconds, are much shorter than the frame length,
capacity provisioning can be based on the average arrival
rate during a frame.

2.2 The Data Center Cost Model

The model for data center costs focuses on the server 
costs of the data center, as minimizing server energy con- 
sumption also reduces cooling and power distribution 
costs. We model the cost of a server by the operating
costs incurred by an active server, as well as the switching
cost incurred to toggle a server into and out of a
power-saving mode (e.g., off/on or sleeping/waking).
Both components can be assumed to include energy cost,
delay cost, and wear-and-tear cost. See [26] and [25] for
further discussion of the model.

Note that this model ignores many issues surrounding 
reliability and availability, which are key components of 
data center service level agreements (SLAs). In practice, 
a solution that toggles servers must still maintain the 
reliability and availability guarantees; however this is 
beyond the scope of the current paper. See [35] for a
discussion. 

The Operating Cost 

The operating costs are modeled by a convex function
$f(\lambda_{i,k})$, which is the same for all the servers, where
$\lambda_{i,k}$ denotes the average arrival rate to server $i$ during
frame $k$. The convexity assumption is quite general and
captures many common server models. One example,
which we consider in our numeric examples later, is
to say that the operating costs are simply equal to the
energy cost of the server, i.e., the energy cost of an
active server handling arrival rate $\lambda_{i,k}$. This cost is often
modeled using an affine function as follows

$$f(\lambda_{i,k}) = e_0 + e_1 \lambda_{i,k}, \qquad (1)$$

where $e_0$ and $e_1$ are constants (U, [5], [23]). Note that
when servers use dynamic speed scaling, if the energy
cost is modeled as polynomial in the chosen speed, the
cost $f(\cdot)$ remains convex. In practice, we expect that $f(\cdot)$
will be empirically measured by observing the system
over time.

The Switching Cost 

The switching cost, denoted by $\beta$, models the cost of
toggling a server back-and-forth between active and
power-saving modes. The switching cost includes the
costs of the energy used toggling a server, the delay in
migrating connections/data when toggling a server, and
the increased wear-and-tear on the servers toggling.

2.3 The Data Center Optimization 

Given the cost model above, the data center has two
control decisions at each time: determining $n_k$, the number
of active servers in every time frame, and assigning
arriving jobs to servers, i.e., determining $\lambda_{i,k}$ such that
$\sum_{i=1}^{n_k} \lambda_{i,k} = \lambda_k$. All servers are assumed to be homogeneous
with constant rate capacity $\mu > 0$. Modeling
heterogeneous servers is possible, but the online problem
becomes more complicated [25], which is outside the
scope of this paper.



The goal of the data center is to determine $n_k$ and
$\lambda_{i,k}$ to minimize the cost incurred during $[0, K]$, which
is modeled as follows:

$$\min \; \sum_{k=1}^{K} \sum_{i=1}^{n_k} f(\lambda_{i,k}) + \beta \sum_{k=1}^{K} (n_k - n_{k-1})^+ \qquad (2)$$

$$\text{s.t.} \quad \begin{cases} 0 \le \lambda_{i,k} \le \lambda_k \\ \sum_{i=1}^{n_k} \lambda_{i,k} = \lambda_k \\ \mathbb{P}(D_k > \bar{D}) \le \bar{\epsilon}, \end{cases} \qquad (3)$$

where the final constraint is introduced to capture the
SLA of the data center. We use $D_k$ to represent the
steady-state delay during frame $k$, and $(\bar{D}, \bar{\epsilon})$ to represent
an SLA of the form "the probability of a delay larger than
$\bar{D}$ must be bounded by probability $\bar{\epsilon}$".

This model generalizes the data center optimization
problem from [26] by accounting for the additional SLA
constraint. The specific values in this constraint are
determined by the stochastic variability at the fast time-scale.
In particular, we derive (for a variety of workload
models) a sufficient constraint $n_k \ge C_k(\bar{D}, \bar{\epsilon})/\mu$ such that

$$n_k \ge \frac{C_k(\bar{D}, \bar{\epsilon})}{\mu} \;\Longrightarrow\; \mathbb{P}(D_k > \bar{D}) \le \bar{\epsilon}. \qquad (4)$$

Here, $\mu$ is the constant rate capacity of each server and
$C_k(\bar{D}, \bar{\epsilon})$ is to be determined for each considered arrival
model. One should interpret $C_k(\bar{D}, \bar{\epsilon})$ as the overall
effective capacity/bandwidth needed in the data center
such that the SLA delay constraint is satisfied within
frame $k$.

Note that the new constraint is only sufficient for the
original SLA constraint. The reason is that $C_k(\bar{D}, \bar{\epsilon})$ will
be computed, in the next section, from upper bounds on
the distribution of the transient delay within a frame.

With the new constraint, however, the optimization
problem in (2) can be considerably simplified. Indeed,
note that $n_k$ is fixed during each time frame $k$ and
the remaining optimization for $\lambda_{i,k}$ is convex. Thus, we
can simplify the form of the optimization problem by
using the fact that the optimal dispatching strategy $\lambda^*_{i,k}$
is load balancing, i.e., $\lambda^*_{1,k} = \lambda^*_{2,k} = \cdots = \lambda_k/n_k$. This
decouples dispatching $\lambda^*_{i,k}$ from capacity planning $n_k$,
and so Eqs. (2)-(3) become:

Data Center Optimization Problem

$$\min \; \sum_{k=1}^{K} n_k f(\lambda_k/n_k) + \beta \sum_{k=1}^{K} (n_k - n_{k-1})^+ \qquad (5)$$

$$\text{s.t.} \quad n_k \ge \frac{C_k(\bar{D}, \bar{\epsilon})}{\mu}.$$

Note that Eq. (5) is a convex optimization, since
$n_k f(\lambda_k/n_k)$ is the perspective function of the convex
function $f(\cdot)$.
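As a minimal sketch of how the objective in Eq. (5) can be evaluated for a candidate schedule, the following Python fragment assumes the affine operating cost of Eq. (1) and load-balanced dispatching. The function names and the default constants `e0` and `e1` are illustrative assumptions, not values from the paper.

```python
import math

def operating_cost(rate_per_server, e0=1.0, e1=0.5):
    """Affine per-server cost f(x) = e0 + e1 * x, as in Eq. (1) (e0, e1 assumed)."""
    return e0 + e1 * rate_per_server

def min_servers(capacity, mu):
    """Smallest integer n satisfying the SLA constraint n >= C / mu."""
    return math.ceil(capacity / mu)

def total_cost(n, lam, beta, f=operating_cost):
    """Objective of Eq. (5) for a schedule n with per-frame arrival rates lam."""
    cost, prev = 0.0, 0  # n_0 = 0: every server starts in the power-saving mode
    for n_k, lam_k in zip(n, lam):
        if n_k > 0:
            cost += n_k * f(lam_k / n_k)   # perspective term n_k * f(lam_k / n_k)
        cost += beta * max(n_k - prev, 0)  # (n_k - n_{k-1})^+: only switching on is charged
        prev = n_k
    return cost
```

For instance, with rates `[4, 2, 4]` and `beta = 2`, the static schedule `[4, 4, 4]` costs 25 while the dynamic schedule `[4, 2, 4]` costs 27, so resizing only pays off in this toy example once the switching cost is small enough.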

As we have already pointed out, the key difference
between the optimization above and that of [26] is the
SLA constraint. This constraint plays a key role
in the current paper: it provides a
bridge between the slow time-scale and fast time-scale
models. Specifically, the fast time-scale model uses large
deviations and stochastic network calculus techniques to
calculate $C_k(\bar{D}, \bar{\epsilon})$.

2.4 Algorithms for Dynamic Resizing 

Though the Data Center Optimization Problem de- 
scribed above is convex, in practice it must be solved 
online, i.e., without knowledge of the future workload. 
Thus, in determining $n_k$, the algorithm may not have
access to the future arrival rates $\lambda_l$ for $l > k$. This fact
makes developing algorithms for dynamic resizing challenging.
However, progress has been made recently [26],
EI- 

Deriving algorithms for this problem is not the goal 
of the current paper. Thus, we make use of a recent 
algorithm called Lazy Capacity Provisioning (LCP) [26].
We choose LCP because of the strong analytic per- 
formance guarantees it provides - LCP provides cost 
within a factor of 3 of optimal for any (even adversarial) 
workload process. 

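LCP works as follows. Let $(n^L_{k,1}, \ldots, n^L_{k,k})$ be the solution
vector to the following optimization problem

$$\min \; \sum_{l=1}^{k} n_l f(\lambda_l/n_l) + \beta \sum_{l=1}^{k} (n_l - n_{l-1})^+$$

$$\text{s.t.} \quad n_l \ge \frac{C_l(\bar{D}, \bar{\epsilon})}{\mu}, \qquad n_0 = 0.$$

Similarly, let $(n^U_{k,1}, \ldots, n^U_{k,k})$ be the solution vector to the
following optimization problem

$$\min \; \sum_{l=1}^{k} n_l f(\lambda_l/n_l) + \beta \sum_{l=1}^{k} (n_{l-1} - n_l)^+$$

$$\text{s.t.} \quad n_l \ge \frac{C_l(\bar{D}, \bar{\epsilon})}{\mu}, \qquad n_0 = 0.$$

Denote $(n)_a^b = \max(\min(n, b), a)$ as the projection of $n$
into the closed interval $[a,b]$. Then LCP can be defined
using $n^L_{k,k}$ and $n^U_{k,k}$ as follows. Informally, LCP stays
"lazily" between the upper bound $n^U_{k,k}$ and the lower
bound $n^L_{k,k}$ in all frames.

Lazy Capacity Provisioning, LCP

Let $n^{LCP} = (n^{LCP}_0, \ldots, n^{LCP}_K)$ denote the vector of active
servers under LCP. This vector can be calculated online using
the following forward recurrence relation:

$$n^{LCP}_k = \begin{cases} 0, & k \le 0 \\ \big(n^{LCP}_{k-1}\big)^{n^U_{k,k}}_{n^L_{k,k}}, & 1 \le k \le K. \end{cases}$$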



Note that, in [26], LCP is introduced and analyzed for
the optimization from Eq. (5) without the SLA constraint.
However, it is easy to see that the algorithm and performance
guarantee extend to our setting. Specifically, the
guarantees on LCP hold in our setting because the SLA
constraint can be removed by defining the operating cost
to be $\infty$ instead of $n_k f(\lambda_k/n_k)$ when $n_k < C_k(\bar{D}, \bar{\epsilon})/\mu$.

A last point to highlight about LCP is that, as described,
it does not use any predictions about the workload
in future frames. Such information could clearly be
beneficial, and can be incorporated into LCP if desired;
see [26].
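The LCP recurrence can be illustrated with a small sketch. The version below is not the implementation from [26]: it assumes integer server counts, the affine operating cost of Eq. (1), and a brute-force dynamic program over server counts to obtain the last components of the lower-bound and upper-bound solutions. Ties are broken toward the smallest lower bound and the largest upper bound, which is also an assumption. The argument `n_min[k]` stands for the SLA floor $\lceil C_k(\bar{D}, \bar{\epsilon})/\mu \rceil$.

```python
import math

def lcp_schedule(lam, n_min, beta, e0=1.0, e1=0.5, n_max=None):
    """Integer-valued LCP sketch; n_min[k] is the SLA floor ceil(C_k / mu)."""
    n_max = n_max if n_max is not None else max(n_min)
    states = range(n_max + 1)
    INF = math.inf

    def op_cost(n, lam_k, floor):
        if n < floor:
            return INF                      # SLA violated: operating cost set to infinity
        return n * (e0 + e1 * lam_k / n) if n > 0 else 0.0

    def step(prev_values, lam_k, floor, toggle):
        # One dynamic-programming step: best prefix cost ending with n servers.
        return [op_cost(n, lam_k, floor)
                + min(prev_values[m] + beta * toggle(m, n) for m in states)
                for n in states]

    turn_on = lambda m, n: max(n - m, 0)    # lower-bound problem: (n_l - n_{l-1})^+
    turn_off = lambda m, n: max(m - n, 0)   # upper-bound problem: (n_{l-1} - n_l)^+

    start = [0.0 if n == 0 else INF for n in states]   # n_0 = 0
    low_values, up_values = start, start
    schedule, current = [], 0
    for lam_k, floor in zip(lam, n_min):
        low_values = step(low_values, lam_k, floor, turn_on)
        up_values = step(up_values, lam_k, floor, turn_off)
        best_low, best_up = min(low_values), min(up_values)
        lower = min(n for n in states if low_values[n] <= best_low + 1e-9)
        upper = max(n for n in states if up_values[n] <= best_up + 1e-9)
        current = max(min(current, upper), lower)       # projection onto [lower, upper]
        schedule.append(current)
    return schedule
```

With a one-frame dip in the SLA floor, `[5, 5, 1, 5, 5]`, this sketch keeps all five servers on when `beta = 100` and follows the dip when `beta = 0.1`, which matches the "lazy" behavior described above.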



3 Fast Time-scale Model 

Given the model of the slow time-scale in the previous 
section, we now zoom in to give a description for the 
fast time-scale model. By "fast" time-scale, we mean the 
time-scale at which requests arrive, as opposed to the 
"slow" time-scale at which dynamic resizing decisions 
are made by the data center. To model the fast time-scale,
we evenly break each frame from the slow time-scale
into "slots" $t \in \{1, \ldots, U\}$, such that frame_length = $U \cdot$
slot_length.
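To make the relation between slots and frames concrete, the sketch below aggregates per-slot arrival counts into per-frame mean rates $\lambda_k$ and computes the resulting peak-to-mean ratio. The function names are illustrative, and the trace is assumed to be given as a plain list of per-slot counts.

```python
def frame_rates(arrivals_per_slot, slots_per_frame, slot_length=1.0):
    """Mean arrival rate of each frame, computed from per-slot arrival counts."""
    rates = []
    for start in range(0, len(arrivals_per_slot), slots_per_frame):
        frame = arrivals_per_slot[start:start + slots_per_frame]
        rates.append(sum(frame) / (len(frame) * slot_length))
    return rates

def peak_to_mean(rates):
    """Peak-to-mean ratio of the slow time-scale workload."""
    return max(rates) / (sum(rates) / len(rates))
```

For example, `frame_rates([2, 4, 6, 8, 1, 1], 2)` yields the three frame rates 3, 7, and 1.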

We consider a variety of models for the workload 
process at this fast time-scale, including both light-tailed 
models with various degrees of burstiness, as well as 
heavy-tailed models that exhibit self-similarity. In all 
cases, our assumption is that the workload is stationary 
over the slots that make up each time frame. 

The goal of this section is to derive the value of
$C_k(\bar{D}, \bar{\epsilon})$ in the constraint $n_k \ge C_k(\bar{D}, \bar{\epsilon})/\mu$ from Eq. (4), and
thus enable an interface between the fast and slow time-scales
by parameterizing the Data Center Optimization
Problem from Eq. (5) for a broad range of workloads.

Note that throughout this section we suppress the frame
subscript $k$ for $n_k$, $\lambda_k$, $C_k$, and $D_k$, and focus on a generic
frame.

Our approach for deriving the SLA constraint for the 
Data Center Optimization Problem will be to first derive 
an "aggregation property" which allows the data center 
to be modeled as a single server, and to then derive 
bounds on the distribution of the transient delay under 
a variety of arrival processes. 

3.1 An Aggregation Property 

Note that, if the arrival process were modeled as Poisson 
and job sizes were exponential, then an "aggregation 
property" would be immediate, since the response time 
distribution only depends on the load. Hence the SLA 
could be derived by considering a single server. Outside 
of this simple case, however, we need to derive a suitable 
single server approximation. 

The aggregation result that we derive and apply is
formulated in the framework of stochastic network calculus [9],
and so we begin by briefly introducing this
framework.

Denote the cumulative arrival (workload) process at
the data center's dispatcher by $A(t)$. That is, for each slot
$t = 1, \ldots, U$, $A(t)$ counts the total number of jobs arrived
in the time interval $[0, t]$. Depending on the total number
$n$ of active servers, the arrival process is dispatched
into the sub-arrival processes $A_i(t)$ with $i = 1, \ldots, n$.
The cumulative response processes from the servers are
denoted by $R_i(t)$, whereas the total cumulative response
process from the data center is denoted by $R(t) = \sum_i R_i(t)$.
All arrival and response processes are assumed
to be non-negative, non-decreasing, and left-continuous,
and satisfy the initial condition $A(0) = R(0) = 0$. For
convenience we use the bivariate extensions $A(s, t) := A(t) - A(s)$
and $R(s,t) := R(t) - R(s)$.

The service provided by a server is modeled in terms
of probabilistic lower bounds using the concept of a
stochastic service process. This is a bivariate random process
$S(s, t)$ which is non-negative, non-decreasing, and
left-continuous. Formally, a server is said to guarantee
a (stochastic) service process $S(s,t)$ if for any arrival
process $A(t)$ the corresponding response process $R(t)$
from the server satisfies for all $t \ge 0$

$$R(t) \ge A * S(t), \qquad (6)$$

where '$*$' denotes the min-plus convolution operator, i.e.,
for two (random) processes $A(t)$ and $S(s, t)$,

$$A * S(t) := \inf_{0 \le s \le t} \{A(s) + S(s, t)\}. \qquad (7)$$

The inequality in (6) is assumed to hold almost surely.
Note that the lower bound set by the service process is
invariant to the arrival processes.
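The min-plus convolution in Eq. (7) can be checked numerically on a slotted time axis. The sketch below assumes a discrete-time fluid queue served at a constant rate `c` per slot (a simplification chosen for illustration). For such a queue the departures coincide with $A * S(t)$ for $S(s,t) = c(t - s)$, so Eq. (6) holds with equality.

```python
def min_plus_conv(A, service, t):
    """(A * S)(t) = min over 0 <= s <= t of A[s] + S(s, t), as in Eq. (7)."""
    return min(A[s] + service(s, t) for s in range(t + 1))

def constant_rate_departures(increments, c):
    """Cumulative arrivals A and departures R of a discrete-time fluid queue."""
    A, R, backlog = [0], [0], 0
    for a in increments:
        backlog = max(backlog + a - c, 0)   # Lindley recursion for the backlog
        A.append(A[-1] + a)
        R.append(A[-1] - backlog)           # departures = arrivals minus backlog
    return A, R
```

With per-slot arrivals `[3, 0, 0, 5, 1]` and `c = 2`, the departures are `[0, 2, 3, 3, 5, 7]`, which matches the min-plus convolution slot by slot.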

We are now ready to state the aggregation property.
The proof is deferred to Section 5.

Lemma 1: Consider an arrival process $A(t)$ which is
dispatched to $n$ servers. Each server $i$ is work-conserving
with constant rate capacity $\mu > 0$. Arrivals are dispatched
deterministically across the servers such that
each server $i$ receives a fraction $1/n$ of the arrivals. Then,
the system has service process $S(s,t) = n\mu(t - s)$, i.e.,
$R(t) \ge A * S(t)$.

The significance of the Lemma is that if the SLA is 
verified for the virtual server with arrival process A(t) 
and service process S(s,t), then the SLA is verified for 
each of the n servers. 

3.2 Arrival Processes 

Now that we can reduce the study of the multi-server 
system to the study of a single server system using 
Lemma 1, we can move to characterizing the impact of
the arrival process on the SLA constraint in the Data 
Center Optimization Problem. 

Fig. 1. Three synthetically generated traces within 1
frame, with $\lambda = 300$, of Poisson, Markov-Modulated (MM)
($T = 1$, $\lambda_l = 0.5\lambda$ and $\lambda_h = 2\lambda$), and heavy-tailed arrivals
($b = \lambda/3$ and $\alpha = 1.5$). Horizontal axis: time slot (1 sec).

In particular, the next step in deriving the SLA constraint
$n \ge C(\bar{D}, \bar{\epsilon})/\mu$ is to derive a bound on the distribution
of the delay at the virtual server with arrival process $A(t)$
and service process $S(s,t) = C(\bar{D}, \bar{\epsilon})(t - s)$, i.e.,

$$\mathbb{P}(D(t) > \bar{D}) \le \epsilon. \qquad (8)$$

It is important to observe that the violation probability
$\epsilon$ holds for the transient delay process $D(t)$, which is
defined as $D(t) := \inf\{d : A(t - d) \le R(t)\}$, and which
models the delay spent in the system by the job leaving
the system, if any, at time $t$. However, the violation
probability $\epsilon$ is derived so that it is time invariant, which
implies that it bounds the distribution of the steady-state
delay $D = \lim_{t \to \infty} D(t)$ as well. Therefore, the value of
$C(\bar{D}, \bar{\epsilon})$ can be finally computed by solving the equation
$\epsilon = \bar{\epsilon}$.
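As a small illustration of the transient delay definition $D(t) := \inf\{d : A(t - d) \le R(t)\}$, the sketch below evaluates it on slotted cumulative arrival and departure sequences and reports how often a delay target is exceeded. The function names are hypothetical, and the empirical frequency is only a rough stand-in for the violation probability bounded in Eq. (8).

```python
def transient_delay(A, R, t):
    """D(t) = inf{d >= 0 : A(t - d) <= R(t)} on a slotted time axis."""
    for d in range(t + 1):
        if A[t - d] <= R[t]:
            return d
    return t  # not reached when A[0] = 0 <= R[t]

def violation_frequency(A, R, d_bar):
    """Fraction of slots whose transient delay exceeds the SLA target d_bar."""
    delays = [transient_delay(A, R, t) for t in range(1, len(A))]
    return sum(d > d_bar for d in delays) / len(delays)
```

For `A = [0, 3, 3, 3, 8, 9]` and `R = [0, 2, 3, 3, 5, 7]`, the delays in slots 1 to 5 are 1, 0, 0, 1, and 2, so a target of one slot is violated in one slot out of five.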

In the following, we follow the outline above to com- 
pute C(D, e) for light- and heavy-tailed arrival processes. 
Figure 1 depicts examples of the three types of arrival
processes we consider in 1 frame: Poisson, Markov-Modulated
(MM), and heavy-tailed arrivals. In all three
cases, the mean arrival rate is $\lambda = 300$. The figure clearly
illustrates the different levels of burstiness of the three 
traces. 
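Traces of the three kinds can be generated with a few lines of standard-library Python. This is a sketch under assumptions rather than the generator used for Figure 1: slots have unit length, the MM source starts in the 'low' state, and the transition probabilities passed to `mm_trace` are free parameters chosen by the caller.

```python
import random

def poisson_trace(lam, slots, rng):
    """Per-slot counts of a rate-lam Poisson process with unit-length slots."""
    counts = []
    for _ in range(slots):
        count, clock = 0, rng.expovariate(lam)
        while clock < 1.0:                  # count exponential inter-arrivals in the slot
            count += 1
            clock += rng.expovariate(lam)
        counts.append(count)
    return counts

def mm_trace(lam_low, lam_high, p_h, p_l, slots, rng):
    """Two-state Markov-modulated rates; p_h: low->high, p_l: high->low."""
    state_high, rates = False, []           # assumption: the chain starts in 'low'
    for _ in range(slots):
        rates.append(lam_high if state_high else lam_low)
        flip = p_l if state_high else p_h
        if rng.random() < flip:
            state_high = not state_high
    return rates

def pareto_trace(b, alpha, slots, rng):
    """I.i.d. Pareto work per slot with P(X > x) = (x / b) ** -alpha for x >= b."""
    return [b * rng.paretovariate(alpha) for _ in range(slots)]
```

Passing a seeded `random.Random` instance as `rng` makes the traces reproducible.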

3.2.1 Light-tailed Arrivals

We consider two examples of light-tailed arrival pro- 
cesses: Poisson and Markov-Modulated (MM) processes. 
The latter is particularly interesting since it enables the 
adjustment of the burstiness level. 

Following the theory of the stochastic network calculus,
the computation of a probabilistic bound on the
transient delay $D(t)$ from Eq. (8) reduces to

$$\mathbb{P}(D(t) > \bar{D}) = \mathbb{P}(A(t - \bar{D}) > R(t)) \le \mathbb{P}\Big( \sup_{0 \le s \le t - \bar{D}} \{A(s, t - \bar{D}) - S(s,t)\} > 0 \Big), \qquad (9)$$

where $S(s,t) = C(\bar{D}, \bar{\epsilon})(t - s)$. In the last line we used
the definition of the service process from Eq. (6). This
argument parallels that of [21], [29], except that it adds
more generality by using the service process $S(s,t)$,
which can capture a broad range of scheduling and/or
dispatching policies as constructed in Lemma 1. The last
term is generally bounded, in both the large deviations
and stochastic network calculus literatures, by

$$\sum_{s=0}^{t - \bar{D}} \mathbb{P}\big(A(s, t - \bar{D}) - S(s,t) > 0\big), \qquad (10)$$

where the sum can be shown to converge in the case
of light-tailed arrivals. This bound, which is a direct
application of Boole's inequality, is reasonably tight if the
variables $A(s, t - \bar{D}) - S(s,t)$ are rather uncorrelated
(e.g., the Poisson case), but it is quite loose if they are
highly correlated (e.g., the MM case with high burstiness
levels) 1531 . 

To derive tight bounds, especially in the MM case, we 
shall rely instead on maximal inequalities which provide 
refined bounds on the last term in Eq. (9). Such refined
bounds have been derived for queueing systems with
various classes of Markov arrival processes [15], [28], [9].

Poisson Arrivals 

We start with the case of Poisson processes, which are 
characterized by a low level of burstiness, due to the 
independent increments property. 

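Let $A(t)$ be a Poisson process with some rate $\lambda > 0$,
and define

$$\theta^* := \sup\Big\{\theta > 0 : \frac{\lambda}{\theta}\big(e^{\theta} - 1\big) \le C(\bar{D}, \bar{\epsilon})\Big\}. \qquad (11)$$

Then a bound on the transient delay process is given for
all $t \ge 0$ by

$$\mathbb{P}(D(t) > \bar{D}) \le e^{-\theta^* C(\bar{D}, \bar{\epsilon}) \bar{D}} := \epsilon. \qquad (12)$$

The proof of this equation is deferred to Section 5.
Solving for $C(\bar{D}, \bar{\epsilon})$ by setting the violation probability $\epsilon$
equal to $\bar{\epsilon}$ yields the implicit solution

$$C(\bar{D}, \bar{\epsilon}) = -\frac{1}{\theta^* \bar{D}} \log \bar{\epsilon}.$$

Further, using the monotonicity of the function
$\frac{\lambda}{\theta}(e^{\theta} - 1)$ in $\theta > 0$, we immediately get the explicit
solution

$$C(\bar{D}, \bar{\epsilon}) = \frac{K}{\log(1 + K)} \lambda, \quad \text{where} \quad K = -\frac{\log \bar{\epsilon}}{\lambda \bar{D}}. \qquad (13)$$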
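The explicit Poisson capacity can be evaluated directly. The sketch below assumes the form $C = \lambda K / \log(1 + K)$ with $K = -\log \bar{\epsilon} / (\lambda \bar{D})$, and also returns the decay rate $\theta^* = \log(1 + K)$ at which the constraint in the definition of $\theta^*$ is tight. The function names are illustrative.

```python
import math

def poisson_capacity(lam, d_bar, eps_bar):
    """Explicit capacity C = lam * K / log(1 + K) for Poisson arrivals."""
    k = -math.log(eps_bar) / (lam * d_bar)
    return lam * k / math.log1p(k)

def poisson_theta_star(lam, d_bar, eps_bar):
    """Decay rate theta* = log(1 + K), where (lam / theta)(e^theta - 1) = C."""
    return math.log1p(-math.log(eps_bar) / (lam * d_bar))
```

Dividing the returned capacity by the per-server rate $\mu$ and rounding up gives the minimum number of servers for the frame. A stricter SLA (smaller $\bar{\epsilon}$ or $\bar{D}$) increases $K$ and hence the required capacity.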



The alternative to using Eq. (12) is to use an exact result
for the steady-state delay distribution at an M/D/1
queue, i.e., [20]

$$\mathbb{P}(D > \bar{D}) = 1 - (1 - \rho)\, e^{\lambda \bar{D}} \sum_{j=0}^{\lfloor \bar{D} C(\bar{D}, \bar{\epsilon}) \rfloor} \frac{(j\rho - \lambda \bar{D})^j}{j!}\, e^{-j\rho}, \qquad (14)$$

where $\rho = \lambda / C(\bar{D}, \bar{\epsilon})$ denotes the utilization factor and
$\lfloor x \rfloor$ denotes the largest integer less than or equal to
$x$. This exact formula poses numerical complications
when $\rho$ approaches one, due to the appearance of large
alternating, very nearly cancelling terms (note that the
factor $j\rho - \lambda \bar{D}$ is negative). The exact result (evaluated
using a sophisticated numerical algorithm [20]) and the
bound from Eq. (12) are almost indistinguishable.
See [13] for a detailed evaluation of the
tightness. Moreover, unlike the exact result, the closed-form
and explicit formula from Eq. (13) readily leads
to qualitative properties on the scaling of the capacity
$C(\bar{D}, \bar{\epsilon})$ (see Theorem 1). For these reasons we shall use
the bound from Eq. (13) throughout.



Markov-modulated Arrivals 

Consider now the case of Markov-Modulated (MM) 
processes which, unlike the Poisson processes, do not 
necessarily have independent increments. The key fea- 
ture for the purposes of this paper is that the burstiness 
of MM processes can be arbitrarily adjusted. 

We consider a simple MM process with two states.
Consider a discrete and homogeneous Markov chain $x(s)$ with
two states denoted by 'low' and 'high', and transition
probabilities $p_h$ and $p_l$ between the 'low' and 'high'
states, and vice-versa, respectively. Assuming that a
source produces at some constant rates $\lambda_l > 0$ and
$\lambda_h > \lambda_l$ while the chain $x(s)$ is in the 'low' and 'high'
states, respectively, then the corresponding MM cumulative
arrival process is

$$A(t) = \sum_{s=1}^{t} \big(\lambda_l I_{\{x(s) = \text{'low'}\}} + \lambda_h I_{\{x(s) = \text{'high'}\}}\big), \qquad (15)$$

where $I_{\{\cdot\}}$ is the indicator function. The average rate of
$A(t)$ is $\lambda = \frac{p_l}{p_h + p_l} \lambda_l + \frac{p_h}{p_h + p_l} \lambda_h$.

To adjust the burstiness level of $A(t)$ we introduce the
parameter $T := \frac{1}{p_h} + \frac{1}{p_l}$, which is the average time for
the Markov chain $x(s)$ to change states twice. We note
that the higher the value of $T$ is, the higher the burstiness
level becomes (the time periods that $x(s)$ spends in the
'high' or 'low' states get longer and longer).

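To compute the delay bound let us construct the
matrix

$$\psi(\theta) = \begin{pmatrix} (1 - p_h) e^{\theta \lambda_l} & p_h e^{\theta \lambda_h} \\ p_l e^{\theta \lambda_l} & (1 - p_l) e^{\theta \lambda_h} \end{pmatrix}$$

for some $\theta > 0$ and consider its spectral radius

$$\chi(\theta) := \frac{(1 - p_h) e^{\theta \lambda_l} + (1 - p_l) e^{\theta \lambda_h} + \sqrt{\Delta}}{2},$$

where $\Delta = \big((1 - p_h) e^{\theta \lambda_l} - (1 - p_l) e^{\theta \lambda_h}\big)^2 + 4 p_h p_l e^{\theta(\lambda_l + \lambda_h)}$. Let also

$$K(\theta) := \max\Big\{ \frac{p_h e^{\theta \lambda_h}}{\chi(\theta) - (1 - p_h) e^{\theta \lambda_l}}, \; \frac{\chi(\theta) - (1 - p_h) e^{\theta \lambda_l}}{p_h e^{\theta \lambda_h}} \Big\}.$$

The two terms are the ratios of the elements of the right-eigenvector
of the matrix $\psi(\theta)$. Also, let

$$\theta^* := \sup\Big\{\theta > 0 : \frac{1}{\theta} \log \chi(\theta) \le C(\bar{D}, \bar{\epsilon})\Big\}.$$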



Then a bound on the transient delay process imme- 
diately follows from the corresponding backlog bound 
(see Hz pp- 340) because of the constant rate service 
assumption 



"(Z)(i) > D) < K(6*)e 



8'C(D,e)D ._ 



Setting the violation probability e equal to e yields the 
implicit solution 



C(D, e) = log -^r- 

y ' ' 9*D 5 K(6* 



(16) 
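Because the MM capacity is only given implicitly, it has to be found numerically. The sketch below is one possible approach, not the authors' procedure: it looks for the $\theta$ at which the effective bandwidth $\frac{1}{\theta}\log\chi(\theta)$ equals the right-hand side of the implicit solution, using plain bisection. The bracket `[lo, hi]`, the iteration count, and the rescaling by $e^{\theta\lambda_h}$ (used to avoid overflow) are all assumptions of this sketch.

```python
import math

def mm_terms(theta, lam_l, lam_h, p_h, p_l):
    """Return (log chi(theta), K(theta)) for the two-state MM source."""
    shift = math.exp(theta * (lam_l - lam_h))      # matrix rescaled by e^{-theta*lam_h}
    a, d = (1 - p_h) * shift, (1 - p_l)            # rescaled diagonal entries
    disc = (a - d) ** 2 + 4 * p_h * p_l * shift    # rescaled discriminant Delta
    chi = (a + d + math.sqrt(disc)) / 2            # rescaled spectral radius
    ratio = p_h / (chi - a)                        # eigenvector ratio (scale invariant)
    return theta * lam_h + math.log(chi), max(ratio, 1 / ratio)

def mm_capacity(lam_l, lam_h, p_h, p_l, d_bar, eps_bar, lo=1e-6, hi=50.0):
    """Bisection on theta; returns the pair (capacity, theta_star)."""
    def gap(theta):
        log_chi, k = mm_terms(theta, lam_l, lam_h, p_h, p_l)
        return log_chi / theta - math.log(k / eps_bar) / (theta * d_bar)
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if gap(mid) < 0 else (lo, mid)
    theta = (lo + hi) / 2
    return mm_terms(theta, lam_l, lam_h, p_h, p_l)[0] / theta, theta
```

The returned capacity lies between the mean rate and the peak rate $\lambda_h$, and the bracket may need widening for parameter values far from those tried here.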

Fig. 2. Illustration of the traces used for numerical experiments.



3.3 Heavy-tailed and Self-similar Arrivals 

We now consider the class of heavy-tailed and self-similar
arrival processes. These processes are fundamentally
different from light-tailed processes in that
deviations from the mean increase in time and decay
in probability as a power law, i.e., more slowly than
exponentially.

To capture the properties of heavy-tailed and self-similar
arrivals $A(t)$, we use the following statistical envelope
model, called the htss envelope [24]: for all $0 \le s \le t$
and $\sigma > 0$,

$$\mathbb{P}\big(A(s, t) > \lambda(t - s) + \sigma (t - s)^H\big) \le K \sigma^{-\alpha}. \qquad (17)$$

Here, $\lambda$ is the long-term arrival rate. The tail index $\alpha$
describes the shape of the tail, and smaller values of $\alpha$
indicate heavier tails. We are interested in the case when
$\alpha \in (1, 2)$, i.e., arrivals with finite mean but infinite variance.
The self-similarity index $H$ (or Hurst parameter)
satisfies $H \in [1/\alpha, 1)$ and describes the arrival process's
behavior when rescaling time. The burst parameter $\sigma$
captures the tail behavior, and $K > 0$ is a constant.

The key characteristic of the htss envelope is that the
error function $\epsilon(\sigma) = K \sigma^{-\alpha}$ is given by a power law.
This means that the arrivals $A(s, t)$ may deviate from
the mean $\lambda(t - s)$ by some very large values with
non-negligible probabilities; moreover, these deviations
increase as a function of time due to self-similarity [24].
As a concrete example of heavy-tailed and self-similar
arrivals, consider the case of a source generating jobs in
every slot according to i.i.d. Pareto random variables $X_i$
with tail distribution for all $x \ge b$:

$$\mathbb{P}(X_i > x) = (x/b)^{-\alpha}, \qquad (18)$$

where $1 < \alpha < 2$. $X_i$ has finite mean $E[X] = \alpha b/(\alpha - 1)$
and infinite variance. The corresponding htss envelope
has the parameters [24]



A = E[X], a, H 



K w 1 
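The Pareto example can be made concrete with a few helper functions. The sketch below only uses the tail distribution in Eq. (18), the mean $E[X] = \alpha b/(\alpha - 1)$, and the power-law error function of the envelope in Eq. (17). The function names are illustrative, and the constant `k` is left as a caller-supplied value rather than fixed.

```python
def pareto_scale(mean_rate, alpha):
    """Scale b such that E[X] = alpha * b / (alpha - 1) equals mean_rate (alpha > 1)."""
    return mean_rate * (alpha - 1) / alpha

def pareto_tail(x, b, alpha):
    """Tail probability P(X > x) from Eq. (18); equals 1 for x <= b."""
    return (x / b) ** -alpha if x > b else 1.0

def envelope_error(sigma, k, alpha):
    """Power-law error function K * sigma ** -alpha of the envelope in Eq. (17)."""
    return k * sigma ** -alpha
```

For a mean rate of 300 and $\alpha = 1.5$ this gives $b = 100$, i.e., one third of the mean rate.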



The bound on the transient delay is then 



D(t) > D) < K (C(D,e)D) 



:= e 



(19) 
(20) 



where 

K = 



1<7< 




«7" 



This bound is state-of-the-art for the delay distribution
in non-asymptotic regimes (i.e., for finite $\bar{D}$). The bound
is asymptotically tight in that it captures the exact decay
exponent; however, the bound can be numerically loose
due to the prefactor $K$. See [24] for a detailed discussion
of the tightness of the bound.

Setting the violation probability $\epsilon$ equal to $\bar{\epsilon}$, we get
the implicit solution



inf 



1<7< 



cqji 1 c(D,e)<*-i(c(D,e)- 7 \) log 7 £ 



D > D 



(a - l)log7 



j. = eD a ~ 
(21) 

It is important to point out that the value of $K$ from Eq. (19) is an
approximation (for technical justifications see [24]). However, there
exists a constant value $K$ which is uniformly bounded in $s$ and $t$
(see Proposition 1.1.1 in [36]); consequently, the scaling law to be
determined in Eq. (23) holds.

The alternative to using Eq. (20) is to use large deviations results
for the steady-state delay distribution [32]:

_A_L_ (0B) -, m 

where $f(x) \sim g(x)$ stands for $\lim_{x\to\infty} f(x)/g(x) = 1$.
Such asymptotic approximations, however, can be inaccurate for typical
$D$ of interest [2 J. Hence, we limit our focus to Eq. (20), which
provides an upper bound on the delay distribution in non-asymptotic
regimes, though the same qualitative results follow when using
Eq. (22).

4 Case Studies 

Given the model described in the previous two sections, we are now
ready to explore the potential of dynamic resizing in data centers, and
how this potential depends on the interaction between
non-stationarities at the slow time-scale and
burstiness/self-similarity at the fast time-scale. Our goal in this
section is to provide insight into which workloads dynamic resizing is
valuable for. To accomplish this, we provide a mixture of analytic
results and trace-driven numerical simulations.

It is important to note that the case studies that follow depend
fundamentally on the modeling performed so far in the paper, which
allows us to capture and adjust independently both the fast time-scale
and slow time-scale properties of the workload. The generality of our
model framework thus enables a rigorous study of the impact of the
workload on the value of dynamic resizing.
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Fig. 4. Impact of burstiness on provisioning n k for MM arrivals. 







4.1 Setup 

Throughout the experimental setup, our aim is to choose parameters that
provide conservative estimates of the cost savings from dynamic
resizing. Thus, one should interpret the savings shown as a lower bound
on the potential savings.

Model Parameters 

The time frame for adapting the number of servers $n_k$ is assumed to
be 10 min, and each time slot is assumed to be 1 s, i.e., $U = 600$.
When not otherwise specified, we assume the following parameters for
the data center SLA: the delay upper bound $D = 200$ ms, and the delay
violation probability $\varepsilon = 10^{-3}$.

The cost is characterized by the two energy parameters $e_0$ and $e_1$,
and the switching cost $\beta$. We choose units such that the fixed
energy cost is $e_0 = 1$. The load-dependent energy consumption is set
to $e_1 = 0$, because the energy consumption of current servers with
typical utilization level is dominated by the fixed costs (U,
H51 , 11231 . Note that adjusting $e_0$ and $e_1$ changes the
magnitude of potential savings under dynamic resizing, but does not
affect the qualitative conclusions about the impact of the workload.
So, due to space constraints, we fix these parameters during the case
studies.

The normalized switching cost $\beta/e_0$ measures the duration a
server must be powered down to outweigh the switching cost. Unless
otherwise specified, we use $\beta = 6$, which corresponds to the
energy consumption for one hour (six frames). This was chosen as an
estimate of the time a server should sleep so that the wear-and-tear of
power cycling matches that of operating [7], [26].

Workload Information 

The workloads for these experiments are drawn from two real-world data
center traces. The first set of traces is from Hotmail, a large email
service running on tens of thousands of servers. We used traces from 8
such servers over a 48-hour period, starting at midnight (PDT) on
Monday, August 4, 2008 [35]. The second set of traces is taken from 6
RAID volumes at MSR Cambridge. The traced period was 1 week starting
from 5 PM GMT on February 22, 2007 [35]. Thus, these activity traces
represent a service used by millions of users and a small service used
by hundreds of users. The traces are normalized so that the peak load
is $\lambda_{\text{peak}} = 1000$, and are visualized in Figure 2. Both
sets of traces show strong diurnal properties and have peak-to-mean
ratios (PMRs) of 1.64 and 4.64 for Hotmail and MSR, respectively. Loads
were averaged over disjoint 10-minute frames.
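
The trace preprocessing described above (averaging over disjoint frames, normalizing to a fixed peak, and computing the peak-to-mean ratio) can be sketched as follows. This is a minimal illustration on made-up numbers; the helper names and the toy input are assumptions, and the actual trace format is not specified here.

```python
def frame_average(loads, frame_len):
    """Average per-slot loads over disjoint frames of frame_len slots.

    A trailing partial frame, if any, is dropped.
    """
    n_frames = len(loads) // frame_len
    return [
        sum(loads[k * frame_len:(k + 1) * frame_len]) / frame_len
        for k in range(n_frames)
    ]

def normalize_to_peak(frames, peak=1000.0):
    """Rescale the frame loads so that the largest one equals `peak`."""
    largest = max(frames)
    return [peak * v / largest for v in frames]

def peak_to_mean(frames):
    """Peak-to-mean ratio (PMR) of a sequence of frame loads."""
    return max(frames) / (sum(frames) / len(frames))

# Toy per-slot loads (hypothetical), averaged over frames of 2 slots.
toy_loads = [1, 3, 2, 2, 6, 2]
toy_frames = normalize_to_peak(frame_average(toy_loads, 2))
print(toy_frames, peak_to_mean(toy_frames))
```

Note that the PMR is invariant to the normalization step, since both the peak and the mean are scaled by the same factor.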

The traces provide information for the slow time-scale model. To
parameterize the fast time-scale model, we adapt the workload based on
the mean arrival rate in each frame, i.e., $\lambda$. To parameterize
the MM processes, we take $\lambda_l = 0.5\lambda$,
$\lambda_h = 2\lambda$, and we adjust the burst parameter $T$ while
keeping $\lambda$ fixed for each process. To parameterize the
heavy-tailed processes, we adjust the tail index $\alpha$ for each
process, and $b$ in Eq. (18) is adapted accordingly in order to keep
the mean fixed at $\lambda$. Unless otherwise stated, we fix
$\alpha = 1.5$ and $T = 1$.
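
The parameterization above amounts to a few lines of arithmetic, sketched below. The Pareto scale follows from the mean $\alpha b/(\alpha-1)$ in Eq. (18). The last helper additionally assumes a two-state modulating process with only the low and high rates, an assumption made here for illustration; under it, keeping the mean at $\lambda$ forces the stationary probability of the high state to be $1/3$. All function names are hypothetical.

```python
def mm_rates(lam):
    """Low and high arrival rates used for the MM process: 0.5*lam and 2*lam."""
    return 0.5 * lam, 2.0 * lam

def high_state_probability(lam):
    """Stationary probability of the high-rate state that keeps the mean at lam.

    Assumes a two-state process: (1 - p) * lam_l + p * lam_h = lam.
    """
    lam_l, lam_h = mm_rates(lam)
    return (lam - lam_l) / (lam_h - lam_l)

def pareto_scale_for_mean(lam, alpha):
    """Pareto scale b such that alpha * b / (alpha - 1) equals lam."""
    if alpha <= 1.0:
        raise ValueError("the Pareto mean is finite only for alpha > 1")
    return lam * (alpha - 1.0) / alpha

lam = 300.0                                  # illustrative frame arrival rate
print(mm_rates(lam))
print(high_state_probability(lam))
print(pareto_scale_for_mean(lam, alpha=1.5))
```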

Comparative Benchmark 

We contrast three designs: (i) the optimal dynamic resiz- 
ing, (ii) dynamic resizing via LCP, and (iii) the optimal 
'static' provisioning. 

The results for the optimal dynamic resizing should be interpreted as
characterizing the potential of dynamic resizing. But, realizing this
potential is a challenge that requires both sophisticated online
algorithms and excellent predictions of future workloads.¹

1. Note that short-term predictions of workload demand within 24 
hours can be quite accurate | IS i, [23 1. 
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Fig. 6. Impact of burstiness on the performance of LCP 
in the Hotmail trace. 

The results for LCP should be interpreted as one exam- 
ple of how much of the potential for dynamic resizing 
can be attained with an online algorithm. One reason 
for choosing LCP is that it does not rely on predicting 
the workload in future frames, and thus provides a 
conservative bound on the achievable cost savings. 

The results for the optimal static provisioning should 
be taken as an optimistic benchmark for today's data 
centers, which typically do not use dynamic resizing. 
We consider the cost incurred by an optimal static 
provisioning scheme that chooses a constant number of 
servers that minimizes the costs incurred based on full 
knowledge of the entire workload. This policy is clearly 
not possible in practice, but it provides a very conservative 
estimate of the savings from right-sizing since it uses 
perfect knowledge of all peaks and eliminates the need 
for overprovisioning in order to handle the possibility of 
flash crowds or other traffic bursts. 
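
To make the comparison between the optimal dynamic and the optimal static benchmarks concrete, the following sketch computes both costs on a toy instance by dynamic programming. It is a simplified illustration, not the optimization solved in the experiments: it assumes that each frame $k$ comes with a known minimum number of servers (the list `required`, hypothetical here), that an active server costs $e_0$ per frame ($e_1 = 0$), that switching a server on costs $\beta$, and that all servers start switched off.

```python
def optimal_dynamic_cost(required, beta, e0=1.0):
    """Minimum of sum_k e0*n_k + beta*max(0, n_k - n_{k-1}) with n_k >= required[k]."""
    n_max = max(required)
    inf = float("inf")
    # prev[p]: cheapest cost so far that ends with p active servers (n_0 = 0).
    prev = [0.0 if p == 0 else inf for p in range(n_max + 1)]
    for m in required:
        cur = [inf] * (n_max + 1)
        for n in range(m, n_max + 1):        # only feasible server counts
            best = min(prev[p] + beta * max(0, n - p) for p in range(n_max + 1))
            cur[n] = e0 * n + best
        prev = cur
    return min(prev)

def optimal_static_cost(required, beta, e0=1.0):
    """Cost of the smallest constant provisioning that satisfies every frame."""
    n = max(required)
    return e0 * n * len(required) + beta * n

def cost_savings(required, beta):
    """Relative savings of optimal dynamic over optimal static provisioning."""
    static = optimal_static_cost(required, beta)
    return 1.0 - optimal_dynamic_cost(required, beta) / static

toy_required = [2, 1, 1, 2]                  # hypothetical per-frame requirements
print(cost_savings(toy_required, beta=0.5))  # cheap switching: resizing pays off
print(cost_savings(toy_required, beta=6.0))  # dip shorter than beta frames: no gain
```

In the toy instance the load dip lasts two frames, so with $\beta = 6$ it is cheaper to keep both servers on, which mirrors the interpretation of $\beta/e_0$ as a break-even sleep duration.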

4.2 Results 

Our experiments are organized to illustrate the impact 
of a wide variety of parameters on the cost savings at- 
tainable via dynamic resizing. The goal is to understand 
for which workloads dynamic resizing can provide large 
enough cost savings to warrant the extra implementation 
complexity. Remember, our setup is designed so that the 
cost savings illustrated is a conservative estimate of the 
true cost savings provided by dynamic resizing. 

The Role of Burstiness 

A key goal of our model is to expose the impact of burstiness on
dynamic resizing, and so we start by focusing on that parameter. Recall
that we can vary burstiness in both the light-tailed and heavy-tailed
settings using $T$ for MM arrivals and $\alpha$ for heavy-tailed
arrivals.

The impact of burstiness on provisioning: A priori, one may expect that
burstiness can be beneficial for dynamic resizing, since it indicates
that there are periods of low load during which energy may be saved.
However, this is not actually true since resizing decisions must be
made at the slow time-scale while burstiness is a characteristic of the
fast time-scale. Thus, burstiness is actually detrimental for dynamic
resizing, since it means that the provisioning decisions made on the
slow time-scale must be made with the bursts in mind, which results in
a larger number of servers being provisioned for the same average
workload. This effect can be seen in Figures 3 and 4, which show the
optimal dynamic provisioning as $\alpha$ and $T$ vary. Recall that
burstiness increases as $\alpha$ decreases and $T$ increases.

The impact of burstiness on cost savings: The larger provisioning
created by increased burstiness manifests itself in the cost savings
attainable through dynamic capacity provisioning as well. This is
illustrated in Figure 5, which shows the cost savings of the optimal
dynamic provisioning as compared to the optimal static provisioning for
varying $\alpha$ and $T$ as a function of the switching cost $\beta$.

The impact of burstiness on LCP: Interestingly, though Figure 5 shows
that the potential of dynamic resizing is limited by increased
burstiness, it turns out that the relative performance of LCP is not
hurt by burstiness. This is illustrated in Figure 6, which shows the
percent of the optimal cost savings that LCP achieves. Importantly, it
is nearly perfectly flat as the burstiness is varied.

The Role of the Peak-to-Mean Ratio 

The impact of the peak-to-mean ratio on the potential benefits of
dynamic resizing is quite intuitive: if the peak-to-mean ratio is high,
then there is more opportunity to benefit from dynamically changing
capacity. Figure 7 illustrates this well-known effect. The workload for
the figure is generated from the traces by scaling as
$\tilde{\lambda}_k = c(\lambda_k)^{\gamma}$, varying $\gamma$ and
adjusting $c$ to keep the mean constant.
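
The scaling used to vary the peak-to-mean ratio can be sketched as follows: each frame load is raised to the power $\gamma$ and the constant $c$ is then chosen so that the mean is unchanged. The function names and the toy loads are assumptions for illustration; the sketch requires strictly positive loads.

```python
def peak_to_mean(loads):
    """Peak-to-mean ratio of a sequence of frame loads."""
    return max(loads) / (sum(loads) / len(loads))

def rescale_peak_to_mean(loads, gamma):
    """Return c * load**gamma for each frame, with c chosen to preserve the mean."""
    powered = [v ** gamma for v in loads]
    mean = sum(loads) / len(loads)
    c = mean / (sum(powered) / len(powered))
    return [c * p for p in powered]

toy_loads = [100.0, 200.0, 400.0, 300.0]     # hypothetical frame loads
for gamma in (0.5, 1.0, 2.0):
    scaled = rescale_peak_to_mean(toy_loads, gamma)
    print(gamma, peak_to_mean(scaled))
```

Values of $\gamma$ above 1 stretch the peaks and raise the peak-to-mean ratio, values below 1 flatten the trace, and $\gamma = 1$ leaves it unchanged.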

In addition to illustrating that a higher peak-to-mean ratio makes
dynamic resizing more valuable, Figure 7 also highlights that there is
a strong interaction between burstiness and the peak-to-mean ratio:
if there is significant burstiness, the benefits that come from a high
peak-to-mean ratio may be diminished considerably.

The Role of the SLA 

The SLA plays a key role in the provisioning of a data center. Here, we
show that the SLA can also have a strong impact on whether dynamic
resizing is valuable, and that this impact depends on the workload.
Recall that in our model the SLA consists of a violation probability
$\varepsilon$ and a delay bound $D$. We deal with each of these in
turn.

Figures 8 and 9 highlight the role the violation probability
$\varepsilon$ has on the provisioning of $n_k$ under the optimal
dynamic resizing in the cases of heavy-tailed and MM arrivals.
Interestingly, we see that there is a significant difference in the
impact of $\varepsilon$ depending on the arrival process. As
$\varepsilon$ gets smaller in the heavy-tailed case, the provisioning
gets significantly flatter, until there is almost no change in $n_k$
over time. In contrast, no such behavior occurs in the MM case and, in
fact, the impact of $\varepsilon$ is quite small. This difference is a
fundamental effect of the "heaviness" of the tail of the arrivals,
i.e., a heavy tail requires significantly more capacity in order to
counter a drop in $\varepsilon$.



This contrast between heavy- and light-tailed arrivals is also evident
in Figure 11, which highlights the cost savings from dynamic resizing
in each case as a function of $\varepsilon$. Interestingly, the cost
savings under light-tailed arrivals is largely independent of
$\varepsilon$, while under heavy-tailed arrivals the cost savings is
monotonically increasing with $\varepsilon$.

The second component of the SLA is the delay bound $D$. The impact of
$D$ on provisioning is much less dramatic. We show an example in the
case of heavy-tailed arrivals in Figure 10. Not surprisingly, the
provisioning increases as $D$ drops. However, the flattening observed
as a result of $\varepsilon$ is not observed here. The case of MM
arrivals is qualitatively the same, and so we do not include it.

When is Dynamic Resizing Valuable? 

Now, we are finally ready to address the question of when (i.e., for
what workloads) dynamic resizing is valuable. To address this question,
we must look at the interaction between the peak-to-mean ratio and the
burstiness. Our goal is to provide a concrete understanding of the
(peak-to-mean, burstiness, SLA) settings for which the potential
savings from dynamic resizing are large enough to warrant
implementation. Figures 12-14 focus on this question. Our hope is that
these figures highlight that a precursor to any debate about the value
of dynamic resizing must be a joint understanding of the expected
workload characteristics and the desired SLA, since for any fixed
choices of two of these parameters (peak-to-mean, burstiness, SLA), the
third can be chosen so that dynamic resizing does or does not provide
significant cost savings for the data center.

Fig. 11. Impact of $\varepsilon$ on the cost savings of dynamic
resizing.

Fig. 12. Characterization of burstiness and peak-to-mean ratio
necessary for dynamic resizing to achieve different levels of cost
reduction.

Fig. 13. Characterization of burstiness and peak-to-mean ratio
necessary for dynamic resizing to achieve 20% cost reduction as a
function of the switching cost, $\beta$.

Starting with Figure 12, we see a set of curves for different levels of
cost savings. The interpretation of the figures is that below (above)
each curve the savings from optimal dynamic resizing is smaller
(larger) than the specified value for the curve. Thus, for example, if
the peak-to-mean ratio is 2 in the Hotmail trace, a 10% cost savings is
possible for all levels of burstiness, but a 30% cost savings is only
possible for $\alpha > 1.5$. However, if the peak-to-mean ratio is 3,
then a 30% cost savings is possible for all levels of burstiness. It is
difficult to say what peak-to-mean and burstiness settings are "common"
for data centers, but as a point of reference, one might expect
large-scale services to have a peak-to-mean ratio similar to that of
the Hotmail trace, i.e., around 1.5-2.5, and smaller-scale services to
have peak-to-mean ratios similar to that of the MSR trace, i.e., around
4-6. The burstiness also can vary widely, but as a rough estimate, one
might expect $\alpha$ to be around 1.4-1.6.

Of course, many of the settings of the data center will affect the
conclusions illustrated in Figure 12. Two of the most important factors
are the switching cost, $\beta$, and the SLA, particularly
$\varepsilon$.

Figure 13 highlights the impact of the magnitude of the switching costs
on the value of dynamic resizing. The curves represent the threshold on
peak-to-mean ratio and burstiness necessary to obtain 20% cost savings
from dynamic resizing. As the switching costs increase, the workload
must have a larger peak-to-mean ratio and/or less burstiness in order
for dynamic resizing to be valuable. This is not unexpected. However,
what is perhaps surprising is the small impact played by the switching
cost. The class of workloads where dynamic resizing is valuable only
shrinks slightly as the switching cost is varied from on the order of
the cost of running a server for 10 minutes ($\beta = 1$) to running a
server for 3 hours ($\beta = 18$).

Interestingly, while the impact of the switching costs on the value of
dynamic resizing is small, the impact of the SLA is quite large. In
particular, the violation probability $\varepsilon$ can dramatically
affect whether dynamic resizing is valuable or not. This is shown in
Figure 14, in which the curves represent the threshold on peak-to-mean
ratio and burstiness necessary to obtain 20% cost savings from dynamic
resizing. We see that, as the violation probability is required to be
smaller, the impact of the peak-to-mean ratio on the potential savings
from dynamic resizing disappears, and the value of dynamic resizing
starts to depend almost entirely on the burstiness of the arrival
process. The reason for this can be observed in Figure 8, which
highlights that the optimal provisioning $n_k$ becomes nearly flat as
$\varepsilon$ decreases.

Fig. 14. Characterization of burstiness and peak-to-mean ratio
necessary for dynamic resizing to achieve 20% cost reduction as a
function of the SLA, $\varepsilon$.
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Supporting Analytic Results 

To this point we have focused on numerical simulations; we now provide
analytic support for the behavior we observed in the experiments above.
In particular, the following two theorems characterize the impact of
burstiness and the SLA $(D, \varepsilon)$ on the value of dynamic
resizing under Poisson and heavy-tailed arrivals. This is accomplished
by deriving the effect of these parameters on $C(D, \varepsilon)$,
which constrains the optimal provisioning $n_k$. A smaller (larger)
$C(D, \varepsilon)$ implies a smaller (larger) provisioning $n_k$,
which in turn implies smaller (larger) costs.

We start by providing a result for the case of Poisson arrivals. The
proof is given in Section 5.

Theorem 1: The service capacity constraint from Eq. (13) increases as
the delay constraint $D$ or the violation probability $\varepsilon$
decreases. It also satisfies the scaling law

C ^° e Go g (fi-4e---) 

as $D^{-1} \log \varepsilon^{-1} \to \infty$.

This theorem highlights that as $\varepsilon$ decreases and/or $D$
decreases, $C(D, \varepsilon)$, and thus the cost of the optimal
provisioning, increases. This shows that the observations made in our
numerical experiments hold more generally. Perhaps the most interesting
point about this theorem, however, is the contrast of the growth rate
with that in the case of heavy-tailed arrivals, which is summarized in
the following theorem. The proof is given in Section 5.


Theorem 2: The implicit solution for the capacity constraint from
Eq. (21) increases as the delay constraint $D$ or the violation
probability $\varepsilon$ decreases, or the value of $\alpha$
decreases. It also satisfies the scaling law



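$$C(D,\varepsilon) = \Theta\!\left(\left(\frac{1}{\varepsilon D^{\alpha-1}}\right)^{1/\alpha}\right) \tag{23}$$

as $\varepsilon D^{\alpha-1} \to 0$ for any given $\alpha \in (1,2)$.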

A key observation about this theorem is that the growth rate of
$C(D, \varepsilon)$ with $\varepsilon$ is much faster than in the case
of Poisson arrivals (polynomial instead of logarithmic). This supports
what is observed in Figure 11. Additionally, Theorem 2 highlights the
impact of burstiness, $\alpha$, and shows that the behavior we have
seen in our experiments holds more generally.

5 Proofs 

In this section, we collect the proofs for the results in 
previous sections. 

We start with the proof of Lemma 1, the aggregation property used to
model the multiserver system with a single service process.

Proof of Lemma 1: Fix $t \ge 1$. Because each server $i$ has a constant
rate capacity $\mu$, it follows that the bivariate processes
$S_i(s,t) = \mu(t-s)$ are service processes for the individual servers
(see 0, pp. 167), i.e.,

$$R_i(t) \ge \inf_{0 \le s \le t}\{A_i(s) + \mu(t-s)\} = \frac{1}{n}\inf_{0 \le s \le t}\{A(s) + n\mu(t-s)\},$$

where $R_i(t)$ is the departure process from server $i$. In the last
line we used the load-balancing dispatching assumption, i.e.,
$A_i(s) = \frac{1}{n}A(s)$. Adding the terms for $i = 1, \ldots, n$ it
immediately follows that

$$\sum_{i} R_i(t) \ge \inf_{0 \le s \le t}\{A(s) + n\mu(t-s)\},$$

which shows that the bivariate process $S(s,t) = n\mu(t-s)$ is a
service process for the virtual system with arrival process
$A(t) = \sum_i A_i(t)$ and departure process $R(t) = \sum_i R_i(t)$. □

Next, we prove the bound used for Poisson arrival processes, i.e.,
Eq. (12).

Proof of Eq. (12): The proof follows closely a technique from [22] for
the analysis of GI/GI/1 queues. Denote for convenience
$C = C(D, \varepsilon)$. Fix $t \ge 1$ and introduce the following
process for all $0 \le s \le t - D$:

$$T(s) = e^{\theta^*\left(A(t-D-s,\,t-D) - Cs\right)}.$$

Consider also the filtration of $\sigma$-algebras

$$\mathcal{F}_s = \sigma\{A(t-D-s,\,t-D)\},$$

i.e., $\mathcal{F}_s \subseteq \mathcal{F}_{s+u}$ for all
$0 \le s \le s+u \le t-D$. Note that $T(s)$ is
$\mathcal{F}_s$-measurable for all $s$ (see [33], pp. 79). Then we can
write for the conditional expectations for all $s, u \ge 0$ with
$s + u \le t - D$

$$
\begin{aligned}
E[T(s+u) \mid \mathcal{F}_s]
&= E\left[T(s)\, e^{\theta^*\left(A(t-D-s-u,\,t-D-s) - Cu\right)} \mid \mathcal{F}_s\right] \\
&= T(s)\, E\left[e^{\theta^*\left(A(t-D-s-u,\,t-D-s) - Cu\right)} \mid \mathcal{F}_s\right] \\
&= T(s)\, E\left[e^{\theta^*\left(A(t-D-s-u,\,t-D-s) - Cu\right)}\right] \\
&= T(s)\, e^{\left(\lambda(e^{\theta^*}-1) - \theta^* C\right)u} \\
&\le T(s).
\end{aligned}
$$

In the second line we used that $T(s)$ is $\mathcal{F}_s$-measurable,
and then we used the independent increments property of the Poisson
process $A(t)$, i.e., $A(t-D-s-u,\,t-D-s)$ is independent of
$\mathcal{F}_s$. Then we computed the moment generating function for
the Poisson process, and finally we used the property of $\theta^*$
from Eq. (11). Therefore, the process $T(s)$ is a supermartingale,
i.e.,

$$E[T(s+u) \mid \mathcal{F}_s] \le T(s).$$

We can now continue Eq. (9) as follows:

$$
\begin{aligned}
P\left(D(t) > D\right)
&\le P\left(\sup_{0 \le s \le t-D}\{A(s,\,t-D) - C(t-D-s)\} > CD\right) \\
&\le P\left(\sup_{0 \le s \le t-D} T(s) > e^{\theta^* CD}\right) \\
&\le e^{-\theta^* CD},
\end{aligned}
$$

which proves Eq. (12). Note that in the last line we used a maximal
inequality for the (continuous) supermartingale $T(s)$ (see [33],
pp. 54). □
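
For a numerical feel of the bound $e^{-\theta^* CD}$, the sketch below evaluates it for given $\lambda$, $C$, and $D$. Since Eq. (11) is not restated in this section, the sketch assumes that $\theta^*$ is the positive root of $\lambda(e^{\theta}-1) = \theta C$, which is the largest value for which the supermartingale step above goes through; this choice, the function names, and the parameter values are assumptions made for illustration.

```python
import math

def theta_star(lam, C, iters=200):
    """Positive root of lam*(exp(theta) - 1) = theta*C, by bisection.

    Assumed definition of theta* (see the lead-in); requires C > lam > 0.
    """
    if not (C > lam > 0):
        raise ValueError("need C > lam > 0")

    def f(theta):
        return lam * (math.exp(theta) - 1.0) - theta * C

    lo = math.log(C / lam)        # f is minimal here and f(lo) < 0
    hi = 2.0 * lo
    while f(hi) <= 0.0:           # f is convex, so it eventually turns positive
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def delay_violation_bound(lam, C, D):
    """Upper bound exp(-theta* * C * D) on P(D(t) > D) for Poisson arrivals."""
    return math.exp(-theta_star(lam, C) * C * D)

lam, C = 1.0, 2.0                 # illustrative arrival rate and capacity
for D in (1.0, 2.0, 4.0):
    print(D, delay_violation_bound(lam, C, D))
```

The bound decays exponentially in $D$, in contrast with the power-law decay obtained for heavy-tailed arrivals.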

Finally, we prove the monotonicity and scaling results in Theorems 1
and 2.

Proof of Theorem 1: First, note that the monotonicity
properties follow immediately from the fact that the 
function f(x) = (1 + x)* is non-decreasing. Next, to 
prove the more detailed scaling laws, simply notice that
$\log(1 + c f(n)) = \Theta(\log f(n))$ for some non-decreasing
function $f(n)$ and a constant $c > 0$. The result follows.

□ 

Proof of Theorem 2:

We first consider the monotonicity properties and then 
the scaling law. 

Monotonicity properties: To prove the monotonicity results on $D$ and
$\varepsilon$, observe that the left-hand side (LHS) of the implicit
equation in Eq. (21) is a non-increasing function of
$C(D, \varepsilon)$, because increasing $C(D, \varepsilon)$ expands the
range of the infimum while decreasing the function inside the infimum.
Moreover, the LHS is unbounded at the boundary
$C(D, \varepsilon) = \lambda$. The solution $C(D, \varepsilon)$ is thus
non-increasing in both $D$ and $\varepsilon$.

Next, to prove monotonicity in a, fix a% < a-i and 
denote by C\ the implicit solution of Eq. j2l"} for a = a,\. 
In the first step we prove that C\D > 1. Let C be the 
solution of 

1 



where the LHS was obtained by relaxing the LHS of 
Eq. |2l| (we used that 7 > 1, C — 7 A < C, and x > log x 
for all x > 1). Consequently, C\ > C, and by assuming 
that the units are properly scaled such that D > 1, it 
follows that CD>1 and hence C X D > 1. 

Secondly, we prove that 7, i.e., the optimal value in 
the solution of C\, satisfies 7 < e. Consider the function 

f(l) = alogi with a = ^T- If ' b y contradiction, 7 > e, 
then /'(7) > and consequently f(j) is increasing on 
[e, 00). Since the function is also increasing in 7, 

we get a contradiction that 7 is the optimal solution as 
assumed, and hence 7 < e. 

Finally, consider the function g(a) 



7 

a log 7 



with a 



g -^- The previous property 7 < e implies that g' (a) < 
and further that g is non-increasing in a and hence in a 
as well. Since ( C * s a ^ so non-increasing in a, we 



obtain that 



inf 



7 



i< 7 <% \ (C 1 D)^(C 1 - 1 X) 

7 



log7~ 



7 



Using the monotonicity in C\ in the term inside the 
infimum, it follows that C\ > C2, where C2 is the implicit 
solution of Eq. (21 1 for a = ol\. Therefore, C(D, e) is non- 
increasing in a. 



Scaling law: To prove the scaling law, denote by C the 
implicit solution of the equation 



inf 

l<7<max ^ ^" 



C a log 



= ED C 



(24) 



7" 



The LHS here was constructed by relaxing the function 
inside the infimum in the LHS of Eq. | |2"T} and extending 
the range of the infimum. This means that the implicit 
solution C is smaller than the implicit solution C(D,e). 
The function inside the infimum of the LHS of Eq. | pl| 
is convex on the domain of 7 and attains its infimum 
at 7 = e^^i. Solving for C and using that C(D,e) > C 
proves the lower bound. 

To prove the upper bound, let us fix a, Dq and £q, and 
denote by Co ( -Do, £0) the corresponding implicit solution. 
Using the monotonicity of the implicit solution in e_D Q_1 , 
as shown above, it follows that 

C(D,s) > C Q (D ,e ) , 



where eD"- 1 < eqD^- 1 . Fixing 7o = Co(£> °f o)+A , let C 



be the solution of the equation 



7o 



7o 



= eD c 



(25) 



Because the range of 7 in the solution of C(D, e) includes 
70, it follows that C{D,e) < C. On the other hand, the 
LHS of Eq. |25} satisfies 



To 



7o 



C*»-i (C - 7o A) 



< 



log7o' 



-^0 



(26) 



where K tt = 7oCo(5 °' go) ^ 



Ca(D ,e )--i \ 



n-. Here we used that 



To 



C > C (D ,e ) (note that we showed before that C > 
C(D,e) and C(D,e) > C (D ,e )). Finally, combining 
Eqs. p5| and p6| l we immediately get the scaling law 

, and since C(D,e) < C the proof is 

□ 



C = 

complete. 



6 Conclusion 

Our goal in this paper is to provide new insight into 
the debate about the potential of dynamic resizing in 
data centers. Clearly, there are many facets of this issue 
relating to the engineering, algorithmic, and reliability 
challenges involved in dynamic resizing which we have 
ignored in this paper. These are all important issues 
when trying to realize the potential of dynamic resizing. 
But, the point we have made in this paper is that 
when quantifying the potential of dynamic resizing it is 
of primary importance to understand the joint impact of 
workload and SLA characteristics. 

To make this point, we have presented a new model that captures the
impact of SLA characteristics in addition to both slow time-scale
non-stationarities in the workload and fast time-scale burstiness in
the workload. This model allows us to provide the first study of
dynamic resizing that captures both the stochastic burstiness and
diurnal non-stationarities of real workloads. Within this model, we
have provided both trace-based numerical case studies and analytical
results. Perhaps most tellingly, our results highlight that even when
two of SLA, peak-to-mean ratio, and burstiness are fixed, the other one
can be chosen to ensure that there either are or are not significant
savings possible via dynamic resizing. Figures 12-14 illustrate how
dependent the potential of dynamic resizing is on these three
parameters. These figures highlight that a precursor to any debate
about the value of dynamic resizing must be an understanding of the
workload characteristics expected and the SLA desired. Then, one can
begin to discuss whether this potential is obtainable.

Future work on this topic includes providing a more 
detailed study of how other important factors affect the 
potential of dynamic resizing, e.g., storage issues, relia- 
bility issues, and the availability of renewable energy. 
Note that provisioning capacity to take advantage of 
renewable energy when it is available is an important 
benefit of dynamic resizing that we have not considered 
at all in the current paper. 
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